Assignment Overview
You do not need to replicate ALL of the analyses presented in the paper, but at minimum you must replicate at least 3 analyses, including at least one descriptive statistical analysis and one inferential statistical analysis. As part of this assignment, you must also replicate to the best of your abilities at least one figure.
“Gregariousness, foraging effort, and affiliative interactions in lactating bonobos and chimpanzees” 2021 - Sean M. Lee, Gottfried Hohmann, Elizabeth V. Lonsdorf, Barbara Fruth,& Carson M. Murraya
This study aimed to investigate the cost of fission-fusion social dynamics on lactating bonobos and chimpanzees given their well-established differences in gregariousness and feeding competition. Lactation is energetically expensive and often requires females to alter their energy expenditure and intake in order to maintain the necessary positive energy balance. The authors predicted that lactating chimpanzees offset the cost of lactation by being less gregarious and spending more time alone, thus mitigating potential costs of intense feeding competition. They also predicted that less gregariousness in lactating chimpanzees would constrain their social budget, resulting in less affiliative interactions overall. The study specifically looked at a community of chimpanzees from Gombe, Tanzania and the Bompusa West bonobo community at LuiKotale, Democratic Republic of the Congo. They collected party scans and detailed behavioral data during focal follows, standardizing data collection across both research sites for easier analysis.
In brief the three predictions tested in this study were:
1. Lactating chimpanzees spend more time alone with their immature offspring than do lactating bonobos
2. Lactating females of the two species do not differ in feeding or travel time
3. Lactating bonobos spend more time engage in social interactions, particularly with individuals other than their immature offspring.
Data on lactating females was pooled into three age bins based on the age of their youngest dependent offspring (0 < 1.5, 1.5 < 3, and 3 < 4.5 years). These categories were included in months within their raw data. To test their predictions, generalized linear mixed models with beta-binomial error structure were fit using the package {glmmTMB} and nonparametric dispersion tests were run with testDispersion function from {DHARMa}. Parameter interactions/significance were tested using the Anova function from {car} and fit models were visually assessed with plotted residuals and QQ plots.
GLMM analysis revealed that in support of the first two predictions, lactating chimpanzees were less gregarious (spent more time alone) than lactating female bonobos. However, they also found lactating female chimpanzees and bonobos did not differ in their social interaction time, and that lactating chimpanzees actually spent proportionally more time affiliating with others. These results suggest that lactating chimpanzees do mitigate feeding competition by being less gregarious, but are still able to maintain their foraging budgets compared to bonobos. I found these results both interesting and surprising, especially because the authors did not provide any clear explanation for why bonobos are more gregarious yet, in this study, do not engage in as much social interactions when compared to lactating chimpanzees. The results of this study complicate the differences between chimpanzees and bonobos, which are often oversimplified in the popular literature. This topic requires closer investigation to better understand the nuances of fission-fusion societies and the formation/maintenance of social relationships.
I first selected an article titled, “Attractiveness of female sexual signaling predicts differences in female grouping patterns between bonobos and chimpanzees” by Surbeck et al. (2021) to replicate for this assignment. I found the study and their results extremely interesting, especially since I recently heard Surbeck present his research at NEEP and have long heard the idea that bonobos are more gregarious than chimps because of ecological differences. However, as I read through the paper and began to try and manipulate the dataset they shared on Dryad, I realized I was missing a lot of the details necessary to run the models they did in their analyses. The results and methodology sections at the end of the paper fail to include the parameters of their GLMMs, making it nearly impossible for me to try and understand and replicate - especially because this kind of modeling is brand new to me.
I then decided to look back over the other articles I had considered earlier for the assignment. This article (Lee et al. 2021) covered a somewhat similar topic while including much more detailed descriptions of their analyses! Though they explained their parameters/analyses fairly well, I still ran into some fun challenges because the authors fit GLMM using {glmmTMB}, which there is frustratingly little information on! Even after all the time I spent learning about glmms and reading about glmmTMB, I still don’t quite understand why they used glmmTMB and not glmer or a more popular glmm function/package. It seems that glmmTMB is preferred when working with zero-inflated models but this analysis did not use such models? Regardless, once I figured out how to run the first GLMM using glmmTMB, the rest of the analyses were fairly easy. I ended up replicating the two main tables from the article (Tables 2, and Tables 3) after fitting/running multiple glmmTMB models, as well as all 5 figures from the raw data. I have inserted images of the figures/tables from Lee et al. (2021) following my own replication of each in order to compare.
My first step of my replication analysis involved loading in the raw data published to Dryad by the article authors. The Dryad data was saved as an Excel file with 2 sheets: one with individual chimp/bonobo data on time spent ‘feeding, traveling, social, social interaction adjusted’ and the other with individual time spent ‘alone.’ I exported each individual Excel sheet into separate .csv files before uploading them to GitHub in order to curl the data into my R workspace.
#loading in 'alone' data
a <- curl("https://raw.githubusercontent.com/vzdanowicz/AN588_DataReplication_vrz/main/lee_alone.csv")
time_alone <- read.csv(a, header = TRUE, sep = ",", stringsAsFactors = FALSE)
names(time_alone)<-c('species','ID','age','total','alone') #updating frame column names for sanity
#loading in 'feed/travel/social' data
b <- curl("https://raw.githubusercontent.com/vzdanowicz/AN588_DataReplication_vrz/main/lee_behav.csv")
feed_travel_social <- read.csv(b, header = TRUE, sep = ",", stringsAsFactors = FALSE)
names(feed_travel_social)<-c('species','ID','age','total','feed','travel','social','social_adj')
#view
head(time_alone)
## species ID age total alone
## 1 bonobo Gwe 0--18 68 0
## 2 bonobo Iri 18--36 88 0
## 3 bonobo Iri 36--54 40 1
## 4 bonobo Nin 18--36 110 0
## 5 bonobo Olg 0--18 70 0
## 6 bonobo Olg 36--54 84 1
head(feed_travel_social)
## species ID age total feed travel social social_adj
## 1 bonobo Dju 0--18 1618 662 316 272 137
## 2 bonobo Gwe 0--18 3493 1415 386 720 334
## 3 bonobo Iri 0--18 2794 993 471 579 307
## 4 bonobo Nin 0--18 2621 1089 370 367 52
## 5 bonobo Olg 0--18 1425 383 202 95 5
## 6 bonobo Uma 0--18 2122 707 400 367 155
The article contains a fairly detailed description of their statistical analyses in R. I used the help() function to learn more about the {glmmTMB} package and functions used by these authors. The glmmTMB function fits generalized linear mixed models following lme4 syntax and using Template Model Builder (TMB). Below are some of the notes I kept track of while researching and trying to wrap my head around the TMB package since its a different version than glmer GLMMs.
formula,data = NULL,family = gaussian(),ziformula = ~0,dispformula = ~1,weights = NULL,offset = NULL,contrasts = NULL,na.action, se = TRUE,verbose = FALSE,doFit = TRUE,control = glmmTMBControl(),REML = FALSE,start = NULL,map = NULL,sparseX = NULL
Exploring & visualizing the datasets..
par(mfrow = c(1,2))
boxplot(data = time_alone, alone ~ age * species, group = time_alone$age)
boxplot(data = feed_travel_social, feed ~ age * species, col = c('cadetblue1','darkseagreen1'))
I was trying to get some visualizations of the entire dataset but was having a hard time without manipulating any of the raw data. The authors used summarized values like mean/se in their figures so I will go through and replicate that later on.
I was not sure if I needed to convert some of the frame values from characters into factors so I did it anyway to be safe! I did this by using the as.factor function and specifying the variables to coerce.
#convert column 'id' from character to factor
time_alone$ID <- as.factor(time_alone$ID)
time_alone$species <- as.factor(time_alone$species)
time_alone$age <- as.factor(time_alone$age)
str(time_alone)
## 'data.frame': 30 obs. of 5 variables:
## $ species: Factor w/ 2 levels "bonobo","chimpanzee": 1 1 1 1 1 1 1 1 1 1 ...
## $ ID : Factor w/ 19 levels "BAH","DL","EZA",..: 9 10 10 11 12 12 13 13 14 15 ...
## $ age : Factor w/ 3 levels "0--18","18--36",..: 1 2 3 2 1 3 1 2 1 3 ...
## $ total : int 68 88 40 110 70 84 48 27 30 119 ...
## $ alone : int 0 0 1 0 0 1 2 2 0 0 ...
str shows that I successfully manipulated the dataframe and all the variables are now either factors or integers. The factor levels come into play when grouping data for replicating the article figures.
Below, I break down the replication analysis by the three main predictions as outlined above. For each prediction, I fit a generalized linear mixed model using glmmTMB (with a beta-binomial error structure) to test the interaction of species and infant age before fitting an independent fixed-effects model. The authors included in the article that if the interaction of species/infant age was significant that they conducted Tukey’s pairwise post hoc comparison yet the step was not required for any of the tested predictions. In addition to replicating the models and figures, I also replicated their dispersion analysis with the testDispersion function from {DHARMa}and was able to match my dispersion values with theirs.
Prediction: Lactating female chimpanzees spend more time alone and are less gregarious than lactating bonobos
Lee et al. (2021) report that the response variable for this first model was calculated by dividing the number of scans of a given female by the total number of party scans collected on that female during that age class. Thankfully, the data was already organized by each species, female ID, and infant age class, so I only had to manipulate my dataframe time_alone to calculate the necessary response variable.
To do this - I first tried creating a new column in time_alone and then wrote a for loop that filled the appropriate proportion values into this new column using the code below.
time_alone$prop_alone <- NULL
for (i in 1:nrow(time_alone)) {
time_alone$prop_alone[i] <- time_alone$alone_scans[i]/time_alone$total_scans[i]
}
After running through that code chunk originally, I got my response variable and began trying to work through the GLMM. I started this process by carefully reading through the article to understand the parameters used, as well as read the {glmmTMB} package information and ecological GLMM examples from our course supplementary readings. I also had to Google what exactly a beta-binomial model meant (as described in the source article).
First I tested the interaction between species and infant age class (as outlined in the article). I first tried to create a full model that included the interaction and then a reduced model with both as independent fixed-effects and test the relationship with Anova.. like we had done in Mixed Effects module. But I kept getting the error message Error in fixef(mod)[[component]] : invalid subscript type ‘list’
I then realized Anova() function from {car} can actually directly test interactions within a single model. So I paired down to just have P1M1 model and then tested the significance of the interaction using Anova() Wald “Chisq” with type “III”. I also repeatedly got the wrong output to my model with the code above, it was printing all the variables and looked nothing like the article outputs. After lots more reading and messing around, I realized I could make the response variable the actual proportion of alone scans and total scans as long as I included weight = total (so weight with the denominator of the proportion of the response variable).
I could not figure out why my results did not look like those in the article… the values of the model above were close but didnt seem to be giving thee right output. I continued to read about the glmmTMB function and package to try and understand what piece I was missing. I eventually figured out that in order to have a proportion (non binary/Bernoulli) response variable in the model, it needs to be weighted. The {glmmTMB} package says: “Binomial models with more than one trial (i.e., not binary/Bernoulli) can either be specified in the form prob ~…,weights = N, or in the more typical two-column matrix cbind(successes,failures)~… form” (Magnusson et al. 2021). I found an example of someone doing this in a question thread online. They kept the response variable as the proportion without creating an additional column in their data frame and used the denominator as the weight. After attempting that version of the model, my output was finally identical to Lee et al. (2021)!! Unfortunately I didn’t keep the trial/error code because it complicated my knitting/object names but it was quite the process.
Model testing the interaction of
speciesand infantage
Below is the first model for prediction 1, testing the interaction effect of species and age to create the glmmTMB object, P1M1, and testing the significance with Anova. In all models I created, lactating female identity, ID, was added as random effect.
P1M1 <- glmmTMB(alone/total ~ species * age + (1 | ID), data = time_alone, family = betabinomial(link = "logit"), weight = total)
P1M1_anova <- Anova(P1M1, type = c("III"), test.statistic = "Chisq")
Before moving on, I made sure to compare the summary statistics of this model with the results from the article to ensure I included the same parameters and got similar enough numbers. I actually was able to exactly replicate their numbers for the models and model testing! After creating my replications of Table 3 and Table 4, we can more easily look at/compare my results with those of Lee et al. (2021).
summary(P1M1)
## Family: betabinomial ( logit )
## Formula: alone/total ~ species * age + (1 | ID)
## Data: time_alone
## Weights: total
##
## AIC BIC logLik deviance df.resid
## 151.8 163.0 -67.9 135.8 22
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## ID (Intercept) 1.009e-08 0.0001005
## Number of obs: 30, groups: ID, 19
##
## Dispersion parameter for betabinomial family (): 13.6
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.9000 0.7189 -5.425 5.8e-08 ***
## specieschimpanzee 2.5870 0.7646 3.383 0.000716 ***
## age18--36 -0.8451 1.2221 -0.692 0.489236
## age36--54 0.5001 0.9182 0.545 0.585949
## specieschimpanzee:age18--36 0.6804 1.3018 0.523 0.601210
## specieschimpanzee:age36--54 -0.8054 1.0334 -0.779 0.435795
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(P1M1_anova)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: alone/total
## Chisq Df Pr(>Chisq)
## (Intercept) 29.4281 1 5.803e-08 ***
## species 11.4475 1 0.0007159 ***
## age 1.3684 2 0.5044876
## species:age 1.5098 2 0.4700581
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Overall, this model (P1M1) and model testing revealed that the interaction between species and infant age was not significant - under the ’Response: alone/total" part of the anova output we can see that “species:age” p-val is 0.470. Thus, I next had to refit the model to include species and infant age as independent fixed-effects. Before running the next model, I pulled out the chisq x2, df, and p val from the Anova using tidy function from {broom} in order to more easily compare and confirm my numbers with the outputs from Lee et al. (2021).
#my results P1M1
P1M1_results <- tidy(P1M1_anova)
P1M1_results_x2 <- P1M1_results$statistic[4]
P1M1_results_df <- P1M1_results$df[4]
P1M1_results_p <- P1M1_results$p.value[4]
#article results P1M1
Lee_P1M1_x2 <- c(1.510)
Lee_P1M1_df <- c(2)
Lee_P1M1_p <- c(0.470)
Version <- c("Replication", "Original")
P1M1_x2 <- c(P1M1_results_x2, Lee_P1M1_x2)
P1M1_df <- c(P1M1_results_df, Lee_P1M1_df)
P1M1_p <- c(P1M1_results_p, Lee_P1M1_p)
P1M1_compare <- data.frame(Version,P1M1_x2,P1M1_df,P1M1_p)
P1M1_compare %>%
kbl() %>%
kable_styling()
| Version | P1M1_x2 | P1M1_df | P1M1_p |
|---|---|---|---|
| Replication | 1.509798 | 2 | 0.4700581 |
| Original | 1.510000 | 2 | 0.4700000 |
The only difference in the values from this first model/test are because of rounding, but you can see the models/test had the same output.
Interaction was not significant so I then needed to model species and infant age as independent fixed-effects. I used about the same model function but changed species * age to species + age and used a type II Anova instead of the type III test needed for interaction affects.
P1M2 <- glmmTMB(alone/total ~ species + age + (1 | ID), data = time_alone, family = betabinomial(link = "logit"), weight = total)
P1M2_anova <- Anova(P1M2, type = c("II"), test.statistic = "Chisq")
I again checked the results if the model summary and Anova below matched Lee et al.(2021). And they do! We can see from the Anova results that species had a significant affect on the model with a very low p val.
summary(P1M2)
## Family: betabinomial ( logit )
## Formula: alone/total ~ species + age + (1 | ID)
## Data: time_alone
## Weights: total
##
## AIC BIC logLik deviance df.resid
## 149.4 157.8 -68.7 137.4 24
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## ID (Intercept) 4.269e-08 0.0002066
## Number of obs: 30, groups: ID, 19
##
## Dispersion parameter for betabinomial family (): 13.9
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.8044 0.5085 -7.481 7.37e-14 ***
## specieschimpanzee 2.4699 0.4814 5.130 2.89e-07 ***
## age18--36 -0.2671 0.4194 -0.637 0.524
## age36--54 -0.1396 0.4048 -0.345 0.730
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(P1M2_anova)
## Analysis of Deviance Table (Type II Wald chisquare tests)
##
## Response: alone/total
## Chisq Df Pr(>Chisq)
## species 26.3209 1 2.891e-07 ***
## age 0.4135 2 0.8132
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
I next pulled out the summary statistics from the Anova in order to more easily compare to the original.
#my results P1M2
P1M2_results <- tidy(P1M2_anova)
#pulling out values for species
P1M2_SPECIES_x2 <- P1M2_results$statistic[1]
P1M2_SPECIES_df <- P1M2_results$df[1]
P1M2_SPECIES_p <- P1M2_results$p.value[1]
#pulling out values for age
P1M2_AGE_x2 <- P1M2_results$statistic[2]
P1M2_AGE_df <- P1M2_results$df[2]
P1M2_AGE_p <- P1M2_results$p.value[2]
#article results P1M2
Lee_species_x2 <- c(26.321)
Lee_species_df <- c(1)
Lee_species_p <- c(0.001)
Lee_age_x2 <- c(0.414)
Lee_age_df <- c(1)
Lee_age_p <- c(0.813)
Version <- c("Replication", "Original")
Species_x2 <- c(P1M2_SPECIES_x2, Lee_species_x2)
Species_df <- c(P1M2_SPECIES_df, Lee_species_df)
Species_p <- c(P1M2_SPECIES_p, Lee_species_p)
P1M2_SPcompare <- data.frame(Version,Species_x2,Species_df,Species_p)
Age_x2 <- c(P1M2_AGE_x2, Lee_age_x2)
Age_df <- c(P1M2_AGE_df, Lee_age_df)
Age_p <- c(P1M2_AGE_p, Lee_age_p)
P1M2_AGEcompare <- data.frame(Version,Age_x2,Age_df,Age_p)
P1M2_SPcompare %>%
kbl() %>%
kable_styling()
| Version | Species_x2 | Species_df | Species_p |
|---|---|---|---|
| Replication | 26.32085 | 1 | 3e-07 |
| Original | 26.32100 | 1 | 1e-03 |
P1M2_AGEcompare %>%
kbl() %>%
kable_styling()
| Version | Age_x2 | Age_df | Age_p |
|---|---|---|---|
| Replication | 0.4135445 | 2 | 0.8132048 |
| Original | 0.4140000 | 1 | 0.8130000 |
After confirming my models and model tests above, I moved on to running dispersion tests for both the interaction model (P1M1) and independent model (P1M2).
Before running the dispersion tests I had to spend some time reading about the {DHARMa} package and what simulating residuals will look like to code and plot. The article included the deviance ratios and p-vals for the dispersion tests but did not include any of the visualizations or further explanations. The code below goes through the residual simulation using the P1M1 model.
P1M1_sim <- simulateResiduals(fittedModel = P1M1, plot = TRUE) #default is 250, authors may have increased because #s slightly dif
testDispersion(P1M1_sim)
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.87923, p-value = 0.96
## alternative hypothesis: two.sided
I also used testDispersion for the independent model for prediction one, P1M2
P1M2_sim <- simulateResiduals(fittedModel = P1M2, plot = TRUE) #default is 250, authors may have increased because #s slightly dif
testDispersion(P1M2_sim)
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.97202, p-value = 0.896
## alternative hypothesis: two.sided
My values are slightly different than those from the article for both model dispersion tests, but that’s to be expected when running a simulation.
Version <- c("Replication", "Original")
P1M1_Disp <- c(0.87923, 0.957)
P1M2_Disp <- c(0.96, 0.960)
P1M1_P <- c(0.972, 1.002)
P1M2_P <- c(0.896, 0.928)
P1_Dispersion <- data.frame(Version, P1M1_Disp, P1M1_P, P1M2_Disp, P1M2_P)
P1_Dispersion %>%
kbl() %>%
kable_styling()
| Version | P1M1_Disp | P1M1_P | P1M2_Disp | P1M2_P |
|---|---|---|---|---|
| Replication | 0.87923 | 0.972 | 0.96 | 0.896 |
| Original | 0.95700 | 1.002 | 0.96 | 0.928 |
These numbers do not exactly match up but show the same results that the models we fit do not show any significant overdispersion.
Prediction: Lactating females chimpanzees and bonobos do not differ in feeding/travel time
Lee et al. (2021) report that the response variable for the models to test Prediction 2 was calculated by dividing the number of point scans of a given female engaged in feeding or travel, during each infant age class, by the total number of scans collected on that female. The rest of the models and figures 2-5 all involve the second dataframe, feed_travel_social. Similarly, before getting started with the models for Prediction 2 I first changed character values to factor in the other data.
#convert column 'id' from character to factor
feed_travel_social$ID <- as.factor(feed_travel_social$ID)
feed_travel_social$species <- as.factor(feed_travel_social$species)
feed_travel_social$age <- as.factor(feed_travel_social$age)
str(feed_travel_social)
## 'data.frame': 34 obs. of 8 variables:
## $ species : Factor w/ 2 levels "bonobo","chimpanzee": 1 1 1 1 1 1 1 1 1 1 ...
## $ ID : Factor w/ 26 levels "BAH","Dju","DL",..: 2 11 13 15 17 24 25 26 19 22 ...
## $ age : Factor w/ 3 levels "0--18","18--36",..: 1 1 1 1 1 1 1 1 2 2 ...
## $ total : int 1618 3493 2794 2621 1425 2122 2717 1314 2196 1969 ...
## $ feed : int 662 1415 993 1089 383 707 1198 604 969 834 ...
## $ travel : int 316 386 471 370 202 400 586 225 490 478 ...
## $ social : int 272 720 579 367 95 367 366 140 279 223 ...
## $ social_adj: int 137 334 307 52 5 155 122 57 93 55 ...
Like with the models for Prediction 1, I started off by testing the interaction between species and infant age. The function below is about the same at the one used in the P1M1 interaction model other than adjustments in the variables being called since I am now modeling the proportion of time spent feeding and the influence of species*age.
P2M1 <- glmmTMB(feed/total ~ species * age + (1 | ID), data = feed_travel_social, family = betabinomial(link = "logit"), weight = total)
P2M1_anova <- Anova(P2M1, type = c("III"), test.statistic = "Chisq")
summary(P2M1)
## Family: betabinomial ( logit )
## Formula: feed/total ~ species * age + (1 | ID)
## Data: feed_travel_social
## Weights: total
##
## AIC BIC logLik deviance df.resid
## 455.3 467.5 -219.7 439.3 26
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## ID (Intercept) 0.03392 0.1842
## Number of obs: 34, groups: ID, 26
##
## Dispersion parameter for betabinomial family (): 56.6
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.46398 0.11563 -4.013 6e-05 ***
## specieschimpanzee -0.11293 0.16074 -0.703 0.4823
## age18--36 0.11751 0.18741 0.627 0.5306
## age36--54 0.36437 0.17785 2.049 0.0405 *
## specieschimpanzee:age18--36 0.53117 0.28383 1.871 0.0613 .
## specieschimpanzee:age36--54 -0.05361 0.24885 -0.215 0.8294
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(P2M1_anova)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: feed/total
## Chisq Df Pr(>Chisq)
## (Intercept) 16.1019 1 6.003e-05 ***
## species 0.4936 1 0.4823
## age 4.1999 2 0.1225
## species:age 4.3585 2 0.1131
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#my results P2M1
P2M1_results <- tidy(P2M1_anova)
P2M1_results_x2 <- P2M1_results$statistic[4]
P2M1_results_df <- P2M1_results$df[4]
P2M1_results_p <- P2M1_results$p.value[4]
#article results P2M1
Lee_P2M1_x2 <- c(4.359)
Lee_P2M1_df <- c(2)
Lee_P2M1_p <- c(0.113)
Version <- c("Replication", "Original")
P2M1_x2 <- c(P2M1_results_x2, Lee_P2M1_x2)
P2M1_df <- c(P2M1_results_df, Lee_P2M1_df)
P2M1_p <- c(P2M1_results_p, Lee_P2M1_p)
P2M1_compare <- data.frame(Version,P2M1_x2,P2M1_df,P2M1_p)
P2M1_compare %>%
kbl() %>%
kable_styling()
| Version | P2M1_x2 | P2M1_df | P2M1_p |
|---|---|---|---|
| Replication | 4.358457 | 2 | 0.1131288 |
| Original | 4.359000 | 2 | 0.1130000 |
I again checked the results of the model summary and Anova matched Lee et al.(2021). And they do! We can see from the Anova results above that the interaction between species and infant age was not significant when looking at proportion of time spent feeding. The next step is to refit the model with species and age as independent fixed-effects.
Interaction was not significant in the model testing proportion of time spent feeding so I next needed to model species and infant age as independent fixed-effects for the feeding data. I used about the same model function but changed species * age to species + age and used a type II Anova instead of the type III test needed for interaction affects.
P2M2 <- glmmTMB(feed/total ~ species + age + (1 | ID), data = feed_travel_social, family = betabinomial(link = "logit"), weight = total)
P2M2_anova <- Anova(P2M2, type = c("II"), test.statistic = "Chisq")
summary(P2M2)
## Family: betabinomial ( logit )
## Formula: feed/total ~ species + age + (1 | ID)
## Data: feed_travel_social
## Weights: total
##
## AIC BIC logLik deviance df.resid
## 455.4 464.5 -221.7 443.4 28
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## ID (Intercept) 0.028 0.1673
## Number of obs: 34, groups: ID, 26
##
## Dispersion parameter for betabinomial family (): 45.1
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.5071 0.1088 -4.662 3.12e-06 ***
## specieschimpanzee -0.0226 0.1256 -0.180 0.8572
## age18--36 0.3544 0.1512 2.344 0.0191 *
## age36--54 0.3189 0.1323 2.411 0.0159 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(P2M2_anova)
## Analysis of Deviance Table (Type II Wald chisquare tests)
##
## Response: feed/total
## Chisq Df Pr(>Chisq)
## species 0.0324 1 0.85721
## age 8.3784 2 0.01516 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
age has significant effect on feeding model (supported by article)
#my results P2M2
P2M2_results <- tidy(P2M2_anova)
#pulling out values for species
P2M2_SPECIES_x2 <- P2M2_results$statistic[1]
P2M2_SPECIES_df <- P2M2_results$df[1]
P2M2_SPECIES_p <- P2M2_results$p.value[1]
#pulling out values for age
P2M2_AGE_x2 <- P2M2_results$statistic[2]
P2M2_AGE_df <- P2M2_results$df[2]
P2M2_AGE_p <- P2M2_results$p.value[2]
#article results P2M2
Lee_species_x2 <- c(0.032)
Lee_species_df <- c(1)
Lee_species_p <- c(0.857)
Lee_age_x2 <- c(8.379)
Lee_age_df <- c(2)
Lee_age_p <- c(0.015)
Version <- c("Replication", "Original")
Species_x2 <- c(P2M2_SPECIES_x2, Lee_species_x2)
Species_df <- c(P2M2_SPECIES_df, Lee_species_df)
Species_p <- c(P2M2_SPECIES_p, Lee_species_p)
P2M2_SPcompare <- data.frame(Version,Species_x2,Species_df,Species_p)
Age_x2 <- c(P2M2_AGE_x2, Lee_age_x2)
Age_df <- c(P2M2_AGE_df, Lee_age_df)
Age_p <- c(P2M2_AGE_p, Lee_age_p)
P2M2_AGEcompare <- data.frame(Version,Age_x2,Age_df,Age_p)
P2M2_SPcompare %>%
kbl() %>%
kable_styling()
| Version | Species_x2 | Species_df | Species_p |
|---|---|---|---|
| Replication | 0.0323731 | 1 | 0.8572112 |
| Original | 0.0320000 | 1 | 0.8570000 |
P2M2_AGEcompare %>%
kbl() %>%
kable_styling()
| Version | Age_x2 | Age_df | Age_p |
|---|---|---|---|
| Replication | 8.378434 | 2 | 0.0151581 |
| Original | 8.379000 | 2 | 0.0150000 |
After confirming my models and model tests above, I moved on to running dispersion tests for both the interaction model (P2M1) and independent model (P2M2).
To look at nonparametric dispersion tests using the testDispersion() function from {DHARMa}… first need to simulateResiduals
P2M1_sim <- simulateResiduals(fittedModel = P2M1, plot = TRUE) #feeding interaction
testDispersion(P2M1_sim)
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 1.1285, p-value = 0.568
## alternative hypothesis: two.sided
P2M2_sim <- simulateResiduals(fittedModel = P2M2, plot = TRUE) #feeding independent
testDispersion(P2M2_sim)
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 1.3363, p-value = 0.136
## alternative hypothesis: two.sided
Version <- c("Replication", "Original")
P2M1_Disp <- c(1.1414, 1.066)
P2M2_Disp <- c(1.2964, 1.146)
P2M1_P <- c(0.584, 0.496)
P2M2_P <- c(0.176, 0.216)
P2M1.M2_Dispersion <- data.frame(Version, P2M1_Disp, P2M1_P, P2M2_Disp, P2M2_P)
P2M1.M2_Dispersion %>%
kbl() %>%
kable_styling()
| Version | P2M1_Disp | P2M1_P | P2M2_Disp | P2M2_P |
|---|---|---|---|---|
| Replication | 1.1414 | 0.584 | 1.2964 | 0.176 |
| Original | 1.0660 | 0.496 | 1.1460 | 0.216 |
Like before, these numbers do not exactly match up but show the same results that the fitted models do not show any significant overdispersion.
The other part of the second prediction involves fitting a model for the proportion of time spent traveling. I first modeled P2M3 to include the interaction of species and age.
P2M3 <- glmmTMB(travel/total ~ species * age + (1 | ID), data = feed_travel_social, family = betabinomial(link = "logit"), weight = total)
P2M3_anova <- Anova(P2M3, type = c("III"), test.statistic = "Chisq")
summary(P2M3)
## Family: betabinomial ( logit )
## Formula: travel/total ~ species * age + (1 | ID)
## Data: feed_travel_social
## Weights: total
##
## AIC BIC logLik deviance df.resid
## 391.5 403.7 -187.8 375.5 26
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## ID (Intercept) 0.02284 0.1511
## Number of obs: 34, groups: ID, 26
##
## Dispersion parameter for betabinomial family (): 303
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.59316 0.07804 -20.415 <2e-16 ***
## specieschimpanzee -0.12538 0.11473 -1.093 0.274
## age18--36 0.14736 0.16062 0.917 0.359
## age36--54 0.15403 0.12064 1.277 0.202
## specieschimpanzee:age18--36 0.20379 0.25869 0.788 0.431
## specieschimpanzee:age36--54 -0.06929 0.16406 -0.422 0.673
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(P2M3_anova)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: travel/total
## Chisq Df Pr(>Chisq)
## (Intercept) 416.7576 1 <2e-16 ***
## species 1.1943 1 0.2745
## age 2.6811 2 0.2617
## species:age 0.8496 2 0.6539
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#my results P2M3
P2M3_results <- tidy(P2M3_anova)
P2M3_results_x2 <- P2M3_results$statistic[4]
P2M3_results_df <- P2M3_results$df[4]
P2M3_results_p <- P2M3_results$p.value[4]
#article results P2M3
Lee_P2M3_x2 <- c(0.850)
Lee_P2M3_df <- c(2)
Lee_P2M3_p <- c(0.654)
Version <- c("Replication", "Original")
P2M3_x2 <- c(P2M3_results_x2, Lee_P2M3_x2)
P2M3_df <- c(P2M3_results_df, Lee_P2M3_df)
P2M3_p <- c(P2M3_results_p, Lee_P2M3_p)
P2M3_compare <- data.frame(Version,P2M3_x2,P2M3_df,P2M3_p)
P2M3_compare %>%
kbl() %>%
kable_styling()
| Version | P2M3_x2 | P2M3_df | P2M3_p |
|---|---|---|---|
| Replication | 0.8495625 | 2 | 0.6539128 |
| Original | 0.8500000 | 2 | 0.6540000 |
I again checked the results of the model summary and Anova matched Lee et al.(2021). And they do! We can see from the Anova results above that the interaction between species and infant age was not significant when looking at proportion of time spent traveling. The slight difference between my replication and the original is just due to rounding. The next step is to refit the travel model with species and age as independent fixed-effects.
I now modeled proportion of time traveling with age and species as independent fixed-effects.
P2M4 <- glmmTMB(travel/total ~ species + age + (1 | ID), data = feed_travel_social, family = betabinomial(link = "logit"), weight = total)
P2M4_anova <- Anova(P2M4, type = c("II"), test.statistic = "Chisq")
summary(P2M4)
## Family: betabinomial ( logit )
## Formula: travel/total ~ species + age + (1 | ID)
## Data: feed_travel_social
## Weights: total
##
## AIC BIC logLik deviance df.resid
## 388.5 397.6 -188.2 376.5 28
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## ID (Intercept) 0.01065 0.1032
## Number of obs: 34, groups: ID, 26
##
## Dispersion parameter for betabinomial family (): 203
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.60571 0.06880 -23.338 < 2e-16 ***
## specieschimpanzee -0.09255 0.08014 -1.155 0.24818
## age18--36 0.25062 0.09433 2.657 0.00789 **
## age36--54 0.10520 0.08471 1.242 0.21428
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(P2M4_anova)
## Analysis of Deviance Table (Type II Wald chisquare tests)
##
## Response: travel/total
## Chisq Df Pr(>Chisq)
## species 1.3335 1 0.24818
## age 7.1529 2 0.02798 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
age has significant effect on feeding model (supported by article)
#my results P2M4
P2M4_results <- tidy(P2M4_anova)
#pulling out values for species
P2M4_SPECIES_x2 <- P2M4_results$statistic[1]
P2M4_SPECIES_df <- P2M4_results$df[1]
P2M4_SPECIES_p <- P2M4_results$p.value[1]
#pulling out values for age
P2M4_AGE_x2 <- P2M4_results$statistic[2]
P2M4_AGE_df <- P2M4_results$df[2]
P2M4_AGE_p <- P2M4_results$p.value[2]
#article results P2M4
Lee_species_x2 <- c(1.334)
Lee_species_df <- c(1)
Lee_species_p <- c(0.248)
Lee_age_x2 <- c(7.153)
Lee_age_df <- c(2)
Lee_age_p <- c(0.028)
Version <- c("Replication", "Original")
Species_x2 <- c(P2M4_SPECIES_x2, Lee_species_x2)
Species_df <- c(P2M4_SPECIES_df, Lee_species_df)
Species_p <- c(P2M4_SPECIES_p, Lee_species_p)
P2M4_SPcompare <- data.frame(Version,Species_x2,Species_df,Species_p)
Age_x2 <- c(P2M4_AGE_x2, Lee_age_x2)
Age_df <- c(P2M4_AGE_df, Lee_age_df)
Age_p <- c(P2M4_AGE_p, Lee_age_p)
P2M4_AGEcompare <- data.frame(Version,Age_x2,Age_df,Age_p)
P2M4_SPcompare %>%
kbl() %>%
kable_styling()
| Version | Species_x2 | Species_df | Species_p |
|---|---|---|---|
| Replication | 1.333499 | 1 | 0.2481836 |
| Original | 1.334000 | 1 | 0.2480000 |
P2M4_AGEcompare %>%
kbl() %>%
kable_styling()
| Version | Age_x2 | Age_df | Age_p |
|---|---|---|---|
| Replication | 7.152864 | 2 | 0.0279753 |
| Original | 7.153000 | 2 | 0.0280000 |
After confirming my models and model tests above, I moved on to running dispersion tests for both the interaction model (P2M3) and independent model (P2M4) for the travel models.
To look at nonparametric dispersion tests using the testDispersion() function from {DHARMa}… first need to simulateResiduals
P2M3_sim <- simulateResiduals(fittedModel = P2M3, plot = TRUE) #travel interaction
testDispersion(P2M3_sim)
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.72744, p-value = 0.2
## alternative hypothesis: two.sided
P2M4_sim <- simulateResiduals(fittedModel = P2M4, plot = TRUE) #travel independent
testDispersion(P2M4_sim)
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.80201, p-value = 0.36
## alternative hypothesis: two.sided
Version <- c("Replication", "Original")
P2M3_Disp <- c(0.73323, 0.859)
P2M4_Disp <- c(0.80546, 0.895)
P2M3_P <- c(0.2, 0.200)
P2M4_P <- c(0.368, 0.328)
P2M3.M4_Dispersion <- data.frame(Version, P2M3_Disp, P2M3_P, P2M4_Disp, P2M4_P)
P2M3.M4_Dispersion %>%
kbl() %>%
kable_styling()
| Version | P2M3_Disp | P2M3_P | P2M4_Disp | P2M4_P |
|---|---|---|---|---|
| Replication | 0.73323 | 0.2 | 0.80546 | 0.368 |
| Original | 0.85900 | 0.2 | 0.89500 | 0.328 |
Like before, these numbers do not exactly match up but show the same results that the fitted models do not show any significant overdispersion.
This was a long tedious process becausee so many of the popular table packages do not support glmmTMB so I had to pull all my model results into tibbles then dataframes to manipulate a final table. I unfortunately couldn’t just use {stargazer} or similiar packages because next-to-nothing is compatible with glmmTMB outputs… which is annoying!
#time alone interaction
P1M1_tidy <- tidy(P1M1)
P1M1_df <- as.data.frame(P1M1_tidy)
#time alone independent
P1M2_tidy <- tidy(P1M2)
P1M2_df <- as.data.frame(P1M2_tidy)
#time feeding interaction
P2M1_tidy <- tidy(P2M1)
P2M1_df <- as.data.frame(P2M1_tidy)
#time feeding independent
P2M2_tidy <- tidy(P2M2)
P2M2_df <- as.data.frame(P2M2_tidy)
#time traveling interaction
P2M3_tidy <- tidy(P2M3)
P2M3_df <- as.data.frame(P2M3_tidy)
#time traveling independent
P2M4_tidy <- tidy(P2M4)
P2M4_df <- as.data.frame(P2M4_tidy)
#time social interaction
P3M1_tidy <- tidy(P3M1)
P3M1_df <- as.data.frame(P3M1_tidy)
#time social independent
P3M2_tidy <- tidy(P3M2)
P3M2_df <- as.data.frame(P3M2_tidy)
#time social adjusted interaction
P3M3_tidy <- tidy(P3M3)
P3M3_df <- as.data.frame(P3M3_tidy)
#time social adjusted independent
P3M4_tidy <- tidy(P3M4)
P3M4_df <- as.data.frame(P3M4_tidy)
In this next step, I specifically pulled out the estimates, se, and p-values I would need for replicating each table.
#Table 3 - GLMM parameter estimates for independent effects models
alone_independent <- P1M2_df[c(1:4),c(4:8)]
feed_independent <- P2M2_df[c(1:4),c(4:8)]
travel_independent <- P2M4_df[c(1:4),c(4:8)]
social_independent <- P3M2_df[c(1:4),c(4:8)]
social_adj_independent <- P3M4_df[c(1:4),c(4:8)]
Changing column names in data frames to more closely match the tables from Lee et al. (2021)
alone_independent$term <- c("Intercept", "Chimpanzee","Infant age class 1.5 < 3","Infant age class 3 < 4.5")
feed_independent$term <- c("Intercept", "Chimpanzee","Infant age class 1.5 < 3","Infant age class 3 < 4.5")
travel_independent$term <- c("Intercept", "Chimpanzee","Infant age class 1.5 < 3","Infant age class 3 < 4.5")
social_independent$term <- c("Intercept", "Chimpanzee","Infant age class 1.5 < 3","Infant age class 3 < 4.5")
social_adj_independent$term <- c("Intercept", "Chimpanzee","Infant age class 1.5 < 3","Infant age class 3 < 4.5")
Preparing independent effects table (table 3) by writing for loop to round the long values
Independent_Table <- rbind.data.frame(alone_independent,feed_independent,travel_independent,social_independent,social_adj_independent)
Independent_Table <- Independent_Table %>%
mutate_if(is.numeric, round, digits = 3)
I spent way too long trying to figure out how to italicize z and P.. decided to just italicize the entire first row.
#Table3
kbl(Independent_Table, col.names = c("Model", "Estimate", "Standard error", "z", "P"), caption = "Table 3 - GLMM parameter estimates for independent effects models") %>%
kable_paper(html_font = "Cambria") %>%
pack_rows(index = c("Time Alone" = 4, "Feeding" = 4, "Travel" = 4, "Social interactions" = 4, "Adjusted social interactions" = 4))%>%
add_indent(1:20, level_of_indent = 10)%>%
row_spec(0, italic = T)
| Model | Estimate | Standard error | z | P |
|---|---|---|---|---|
| Time Alone | ||||
| Intercept | -3.804 | 0.509 | -7.481 | 0.000 |
| Chimpanzee | 2.470 | 0.481 | 5.130 | 0.000 |
| Infant age class 1.5 < 3 | -0.267 | 0.419 | -0.637 | 0.524 |
| Infant age class 3 < 4.5 | -0.140 | 0.405 | -0.345 | 0.730 |
| Feeding | ||||
| Intercept | -0.507 | 0.109 | -4.662 | 0.000 |
| Chimpanzee | -0.023 | 0.126 | -0.180 | 0.857 |
| Infant age class 1.5 < 3 | 0.354 | 0.151 | 2.344 | 0.019 |
| Infant age class 3 < 4.5 | 0.319 | 0.132 | 2.411 | 0.016 |
| Travel | ||||
| Intercept | -1.606 | 0.069 | -23.338 | 0.000 |
| Chimpanzee | -0.093 | 0.080 | -1.155 | 0.248 |
| Infant age class 1.5 < 3 | 0.251 | 0.094 | 2.657 | 0.008 |
| Infant age class 3 < 4.5 | 0.105 | 0.085 | 1.242 | 0.214 |
| Social interactions | ||||
| Intercept | -1.755 | 0.115 | -15.224 | 0.000 |
| Chimpanzee | 0.067 | 0.130 | 0.516 | 0.606 |
| Infant age class 1.5 < 3 | -0.157 | 0.163 | -0.960 | 0.337 |
| Infant age class 3 < 4.5 | -0.249 | 0.162 | -1.534 | 0.125 |
| Adjusted social interactions | ||||
| Intercept | -3.101 | 0.210 | -14.802 | 0.000 |
| Chimpanzee | 0.782 | 0.217 | 3.605 | 0.000 |
| Infant age class 1.5 < 3 | -0.082 | 0.298 | -0.276 | 0.782 |
| Infant age class 3 < 4.5 | -0.031 | 0.229 | -0.135 | 0.892 |
This is a screenshot captured from Lee et al. (2021) of Table 3.
I followed the same steps from above to prepare the data for the interaction effects table.
#Table 4 - GLMM parameter estimates for interaction effect models
alone_interaction <- P1M1_df[c(1,5,6),c(4:8)]
feed_interaction <- P2M1_df[c(1,5,6),c(4:8)]
travel_interaction <- P2M3_df[c(1,5,6),c(4:8)]
social_interaction <- P3M1_df[c(1,5,6),c(4:8)]
social_adj_interaction <- P3M3_df[c(1,5,6),c(4:8)]
Changing term names for consistency with the original article.
alone_interaction$term <- c("Intercept","Chimpanzee × Age 1.5 < 3", "Chimpanzee × Age 3 < 4.5")
feed_interaction$term <- c("Intercept","Chimpanzee × Age 1.5 < 3", "Chimpanzee × Age 3 < 4.5")
travel_interaction$term <- c("Intercept","Chimpanzee × Age 1.5 < 3", "Chimpanzee × Age 3 < 4.5")
social_interaction$term <- c("Intercept","Chimpanzee × Age 1.5 < 3", "Chimpanzee × Age 3 < 4.5")
social_adj_interaction$term <- c("Intercept","Chimpanzee × Age 1.5 < 3", "Chimpanzee × Age 3 < 4.5")
Interaction table prep with rounding and dataframe creation.
Interaction_Table <- rbind.data.frame(alone_interaction,feed_interaction,travel_interaction,social_interaction,social_adj_interaction)
Interaction_Table <- Interaction_Table %>%
mutate_if(is.numeric, round, digits = 3)
#for some reason I had to alter the row names because without this step my table kept printing col numbers and the numbers weren't actually 1-15.. but this fixed it!
rownames(Interaction_Table) <- c(1:15)
#Table 4
kbl(Interaction_Table, col.names = c("Model", "Estimate", "Standard error", "z", "P"), caption = "Table 4 - GLMM parameter estimates for interaction effects models") %>%
kable_paper(html_font = "Cambria") %>%
pack_rows(index = c("Time Alone" = 3, "Feeding" = 3, "Travel" = 3, "Social interactions" = 3, "Adjusted social interactions" = 3))%>%
add_indent(1:15, level_of_indent = 10)%>%
row_spec(0, italic = T)
| Model | Estimate | Standard error | z | P |
|---|---|---|---|---|
| Time Alone | ||||
| Intercept | -3.900 | 0.719 | -5.425 | 0.000 |
| Chimpanzee × Age 1.5 < 3 | 0.680 | 1.302 | 0.523 | 0.601 |
| Chimpanzee × Age 3 < 4.5 | -0.805 | 1.033 | -0.779 | 0.436 |
| Feeding | ||||
| Intercept | -0.464 | 0.116 | -4.013 | 0.000 |
| Chimpanzee × Age 1.5 < 3 | 0.531 | 0.284 | 1.871 | 0.061 |
| Chimpanzee × Age 3 < 4.5 | -0.054 | 0.249 | -0.215 | 0.829 |
| Travel | ||||
| Intercept | -1.593 | 0.078 | -20.415 | 0.000 |
| Chimpanzee × Age 1.5 < 3 | 0.204 | 0.259 | 0.788 | 0.431 |
| Chimpanzee × Age 3 < 4.5 | -0.069 | 0.164 | -0.422 | 0.673 |
| Social interactions | ||||
| Intercept | -1.749 | 0.126 | -13.840 | 0.000 |
| Chimpanzee × Age 1.5 < 3 | -0.126 | 0.324 | -0.390 | 0.696 |
| Chimpanzee × Age 3 < 4.5 | 0.206 | 0.292 | 0.705 | 0.481 |
| Adjusted social interactions | ||||
| Intercept | -2.910 | 0.211 | -13.799 | 0.000 |
| Chimpanzee × Age 1.5 < 3 | 0.506 | 0.495 | 1.023 | 0.306 |
| Chimpanzee × Age 3 < 4.5 | 0.878 | 0.470 | 1.869 | 0.062 |
This is a screenshot captured from Lee et al. (2021) of Table 4.
I first built a theme with the approximate fonts/sizes/format used in Lee et al. (2021), though I did add colors!
leetheme <- theme(plot.title = element_text(family = "Times", "bold", size = 12, hjust = 0.5),
legend.title = element_blank(),
legend.text = element_text(family = "Times", size = (10)),
axis.title = element_text(family = "Times", size = (12)),
axis.text = element_text(family = "Times", size = (12)))
I used the for loops below to add new columns with proportions of each behavior for figure-making because I could not just use the proportion I did in the models built earlier.
#prep for fig1
time_alone$pct <- NULL
for (i in 1:nrow(time_alone)) {
time_alone$pct[i] <- time_alone$alone[i]/time_alone$total[i]
}
#prep for fig2
feed_travel_social$feedpct <- NULL
for (i in 1:nrow(feed_travel_social)) {
feed_travel_social$feedpct[i] <- feed_travel_social$feed[i]/feed_travel_social$total[i]
}
#prep for fig3
feed_travel_social$travelpct <- NULL
for (i in 1:nrow(feed_travel_social)) {
feed_travel_social$travelpct[i] <- feed_travel_social$travel[i]/feed_travel_social$total[i]
}
#prep for fig4
feed_travel_social$socialpct <- NULL
for (i in 1:nrow(feed_travel_social)) {
feed_travel_social$socialpct[i] <- feed_travel_social$social[i]/feed_travel_social$total[i]
}
#prep for fig5
feed_travel_social$adjustpct <- NULL
for (i in 1:nrow(feed_travel_social)) {
feed_travel_social$adjustpct[i] <- feed_travel_social$social_adj[i]/feed_travel_social$social[i]
}
I used the summarise function to create new dataframes with the summarized values for the different behaviors. I grouped the frames by species and age to make it easier to plot.
#Data for figure 1
alone.data <- time_alone %>%
group_by(species, age) %>%
summarise(AvgPct = mean(pct), sd = sd(pct), n = n(), se = sd/sqrt(n))
#Data for figures 2-5
behav.data <- feed_travel_social%>%
group_by(species, age) %>%
summarise(AvgPct.Feed = mean(feedpct), feed.sd = sd(feedpct), feed.n = n(), feed.se = feed.sd/sqrt(feed.n), AvgPct.Travel = mean(travelpct), travel.sd = sd(travelpct), travel.n = n(), travel.se = travel.sd/sqrt(travel.n), AvgPct.Social = mean(socialpct), social.sd = sd(socialpct), social.n = n(), social.se = social.sd/sqrt(social.n), AvgPct.Adjust = mean(adjustpct), adjust.sd = sd(adjustpct), adjust.n = n(), adjust.se = adjust.sd/sqrt(adjust.n))
Figure 1 - Mean ± standard error percentage of time that lactating females spent ranging in parties with only their immature offspring.
fig1 <- ggplot(alone.data, aes(age, AvgPct)) + theme_classic() + leetheme + ggtitle("Time Alone")+
geom_bar(aes(fill = species), stat = "identity", position = position_dodge(0.8), width = 0.9)+
scale_y_continuous(labels = scales::percent_format(accuracy = 1), limits = c(0,.5))+
scale_x_discrete(labels = c("0 < 1.5", "1.5 < 3", "3 < 4.5"))+
xlab("\nAge of youngest infant (years)") + ylab("Percent of focal observation time\n")+
geom_errorbar(aes(ymin = AvgPct-se, ymax = AvgPct+se, group = species),width = 0.1, position = position_dodge(0.8))+
scale_fill_manual(values = c("darkseagreen4", "darkseagreen3"))
fig1
Figure 2 - Mean ± standard error percentage of time that lactating females spent feeding.
fig2 <- ggplot(behav.data, aes(age, AvgPct.Feed)) + theme_classic() + leetheme + ggtitle("Feeding")+
geom_bar(aes(fill = species), stat = "identity", position = position_dodge(0.8), width = 0.9)+
scale_y_continuous(labels = scales::percent_format(accuracy = 1), limits = c(0,1))+
scale_x_discrete(labels = c("0 < 1.5", "1.5 < 3", "3 < 4.5"))+
xlab("\nAge of youngest infant (years)") + ylab("Percent of focal observation time\n")+
geom_errorbar(aes(ymin = AvgPct.Feed-feed.se, ymax = AvgPct.Feed+feed.se, group = species),width = 0.1, position = position_dodge(0.8))+
scale_fill_manual(values = c("darkseagreen4", "darkseagreen3"))
fig2
Figure 3 - Mean ± standard error percentage of time that lactating females spent traveling.
fig3 <- ggplot(behav.data, aes(age, AvgPct.Travel)) + theme_classic() + leetheme + ggtitle("Travel")+
geom_bar(aes(fill = species), stat = "identity", position = position_dodge(0.8), width = 0.9)+
scale_y_continuous(labels = scales::percent_format(accuracy = 1), limits = c(0,.5))+
scale_x_discrete(labels = c("0 < 1.5", "1.5 < 3", "3 < 4.5"))+
xlab("\nAge of youngest infant (years)") + ylab("Percent of focal observation time\n")+
geom_errorbar(aes(ymin = AvgPct.Travel-travel.se, ymax = AvgPct.Travel+travel.se, group = species), width = 0.1, position = position_dodge(0.8))+
scale_fill_manual(values = c("darkseagreen4", "darkseagreen3"))
fig3
Figure 4 - Mean ± standard error percentage of time that lactating females spent engaged in social interactions with any community member.
fig4 <- ggplot(behav.data, aes(age, AvgPct.Social)) + theme_classic() + leetheme + ggtitle("Social Interactions")+
geom_bar(aes(fill = species), stat = "identity", position = position_dodge(0.8), width = 0.9)+
scale_y_continuous(labels = scales::percent_format(accuracy = 1), limits = c(0,.5))+
scale_x_discrete(labels = c("0 < 1.5", "1.5 < 3", "3 < 4.5"))+
xlab("\nAge of youngest infant (years)") + ylab("Percent of focal observation time\n")+
geom_errorbar(aes(ymin = AvgPct.Social-social.se, ymax = AvgPct.Social+social.se, group = species), width = 0.1, position = position_dodge(0.8))+
scale_fill_manual(values = c("darkseagreen4", "darkseagreen3"))
fig4
Figure 5 - Mean ± standard error percentage of social interactions in which lactating females spent engaged in social interactions with individuals other than their immature offspring.
fig5 <- ggplot(behav.data, aes(age, AvgPct.Adjust)) + theme_classic() + leetheme + ggtitle("Adjusted Social Interactions")+
geom_bar(aes(fill = species), stat = "identity", position = position_dodge(0.8), width = 0.9)+
scale_y_continuous(labels = scales::percent_format(accuracy = 1), limits = c(0,1))+
scale_x_discrete(labels = c("0 < 1.5", "1.5 < 3", "3 < 4.5"))+
xlab("\nAge of youngest infant (years)") + ylab("Percent of social interactions\n")+
geom_errorbar(aes(ymin = AvgPct.Adjust-adjust.se, ymax = AvgPct.Adjust+adjust.se, group = species),width = 0.1, position = position_dodge(0.8))+
scale_fill_manual(values = c("darkseagreen4", "darkseagreen3"))
fig5
This took a long time! I had a tough time organizing all my thoughts/data. Especially when it came to replicating so many figures
Social Interaction (P3M1)
The response variable for this first model was calculated by dividing the number of point samples that a given lactating female was engaged in social interactions during each infant age class by the total number of good observations collected on that lactating female during the given infant age class. The model includes the interaction between species and infant agee.
look at model/anova results and compare with paper
I again checked the results of the model summary and Anova matched Lee et al.(2021). And they do! We can see from the Anova results above that the interaction between species and infant age was not significant when looking at proportion of time spent in social interaction. The slight difference between my replication and the original is just due to rounding. The next step is to refit the social model with species and age as independent fixed-effects.